About me

Hi! I’m Ying. I am an Assistant Professor in the Department of Statistics and Data Science at the Wharton School, University of Pennsylvania.

I obtained my PhD in Statistics from Stanford University in 2024, advised by Professors Emmanuel Candès and Dominik Rothenhäusler. Prior to that, I studied Mathematics at Tsinghua University. Before joining Wharton, I spent one year as a Wojcicki-Troper Postdoctoral Fellow at the Harvard Data Science Initiative, where I had the good fortune to work with Professor José Zubizarreta and Professor Marinka Zitnik.

I currently help organize the Online Causal Inference Seminar.


Research interests

I work on statistical problems related to two main themes:

  • Uncertainty quantification
    I study distribution-free inference for quantifying and controlling the uncertainty of black-box AI models. My recent interest is in statistical inference guarantees for settings where scientific discoveries are driven or generated by AI predictions, motivated by applications in predictive screening in drug discovery, generative medical AI, and automated scientific discovery with AI agents.
    This often necessitates inference across multiple, decision-coupled samples, and leads to new conformal prediction methods with selective inference and causal inference capabilities.


News

  • Sep 2025: Our Pessimistic Policy Learning paper has been selected by the Annals of Statistics for presentation in the journal-to-conference track at NeurIPS 2025!

  • Sep 2025: Our paper on the predictive role of covariate shift in generalizability has been accepted to PNAS! Analyzing two large-scale multi-site replication projects, it suggests a predictive, rather than explanatory, role of covariate shift: it informs the strength of the unknown conditional shift, even though it does not explain away all the distribution shift between sites. See my blog post here!

  • May 2025: I’m organizing an invited session on generalizability, transportability, and distribution shift at ACIC 2025!

  • Apr 2025: I gave a talk on our POPPER agent framework at the International Seminar on Selective Inference! [slides] [recording]

  • Feb 2025: Imagine LLM agents for scientific discovery—agents that autonomously gather knowledge by creative reasoning and flexible tool use. How to ensure the soundness of what they acquire? We propose POPPER, a framework where LLM agents design sequential experiments, collect data, and accumulate statistical evidence to validate a free-form hypothesis with error control!

  • Sept 2024: Outputs from black-box foundation models must align with human values before use. For example, can we ensure only human-quality AI-generated medical reports are deferred to doctors? Our paper Conformal Alignment is accepted to NeurIPS 2024!

  • Sept 2024: My paper on optimal variance reduction in online experiments (2021 internship project at LinkedIn) receives the 2024 Jack Youden Prize for the best expository paper in Technometrics! Thank you, ASQ/ASA!

  • March 2024: How to quantify the uncertainty for an “interesting” unit picked by a complicated, data-driven process? Check out JOMI, our framework for conformal prediction with selection conditional coverage!

  • Sept 2023: I’ll be giving a seminar at Genentech on leveraging Conformal Selection [1, 2] for reliable AI-assisted drug discovery.

  • Sept 2023: Scientists often point to distribution shifts when effects from two studies differ, e.g. in replicability failures. Do these shifts really contribute? See our preprint for a formal diagnosis framework. Play with our live app, or explore our data repository! I gave an invited talk about it at the Causality in Practice Conference.

Beyond academics, I love traveling and photography in my free time. See my photography gallery!


Education

Recent posts